Fix attempting to combine Hangul Jamo 0x11a7 (#317)
author Diego Frias <mail@dzfrias.dev>
Sat, 22 Nov 2025 18:42:18 +0000 (12:42 -0600)
committer GitHub <noreply@github.com>
Sat, 22 Nov 2025 18:42:18 +0000 (13:42 -0500)
commit 0260ba56c81e5ef6f06c0804034a36284bcb8710
tree 33dc7ccf0220a4d02b16768030e31d5f2f054861
parent 3460568643dfa7a12180d918ff80a31f51444676

* Fix attempting to combine Hangul Jamo 0x11a7

0x11a7 is not a valid Hangul trailing consonant (T jamo), even though it
is equal to T_BASE. Per the Unicode spec:

  TCount is set to one more than the number of trailing consonants
  relevant to the decomposition algorithm: (0x11C2 - 0x11A8 + 1) + 1

The extra count reserves T index 0 for "no trailing consonant", so the
first valid T jamo is 0x11a8. Also see
https://www.unicode.org/versions/Unicode17.0.0/core-spec/chapter-3/#G59434
for where the spec uses 0x11a8, not 0x11a7, during composition.

* document that utf8proc_map simply wraps utf8proc_decompose and utf8proc_reencode (#312)

* test code refactoring (#318)

* Write regression test for #317
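
The boundary described in the first item can be sketched in C as follows. This is a minimal illustration of the Hangul composition range checks using the constants from the Unicode spec, not the actual code in utf8proc.c. The helper name hangul_compose_pair and its -1 return convention are hypothetical, and the static_assert assumes a C11 compiler.

```c
#include <assert.h>
#include <stdint.h>

/* Constants from the Unicode Hangul syllable composition algorithm. */
enum {
    S_BASE = 0xAC00, L_BASE = 0x1100, V_BASE = 0x1161, T_BASE = 0x11A7,
    L_COUNT = 19, V_COUNT = 21, T_COUNT = 28,
    N_COUNT = V_COUNT * T_COUNT, /* 588 */
    S_COUNT = L_COUNT * N_COUNT  /* 11172 */
};

/* TCount is one more than the number of trailing consonants 0x11A8..0x11C2. */
static_assert(T_COUNT == (0x11C2 - 0x11A8 + 1) + 1, "TCount per Unicode spec");

/*
 * Hypothetical helper: returns the composed syllable for (first, second),
 * or -1 if the pair does not combine under the Hangul algorithm.
 */
static int32_t hangul_compose_pair(int32_t first, int32_t second)
{
    /* L + V -> LV syllable */
    if (first >= L_BASE && first < L_BASE + L_COUNT &&
        second >= V_BASE && second < V_BASE + V_COUNT) {
        int32_t l_index = first - L_BASE;
        int32_t v_index = second - V_BASE;
        return S_BASE + (l_index * V_COUNT + v_index) * T_COUNT;
    }

    /*
     * LV + T -> LVT syllable. T index 0 means "no trailing consonant",
     * so the lower bound is strict: second must be greater than T_BASE.
     * Writing `second >= T_BASE` here would wrongly accept 0x11A7.
     */
    if (first >= S_BASE && first < S_BASE + S_COUNT &&
        (first - S_BASE) % T_COUNT == 0 &&
        second > T_BASE && second < T_BASE + T_COUNT) {
        return first + (second - T_BASE);
    }

    return -1;
}
```

With this sketch, hangul_compose_pair(0xAC00, 0x11A8) yields 0xAC01, while hangul_compose_pair(0xAC00, 0x11A7) yields -1 because 0x11a7 falls outside the valid T range.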

---------

Co-authored-by: Steven G. Johnson <stevenj@alum.mit.edu>
test/misc.c
utf8proc.c